
    The Evolution of First Person Vision Methods: A Survey

    Full text link
    The emergence of new wearable technologies such as action cameras and smart glasses has increased the interest of computer vision scientists in the first-person perspective. Nowadays, this field is attracting the attention and investment of companies aiming to develop commercial devices with First Person Vision recording capabilities. Due to this interest, an increasing demand for methods to process these videos, possibly in real time, is expected. Current approaches present particular combinations of image features and quantitative methods to accomplish specific objectives such as object detection, activity recognition, and user-machine interaction. This paper summarizes the evolution of the state of the art in First Person Vision video analysis between 1997 and 2014, highlighting, among others, the most commonly used features, methods, challenges, and opportunities within the field.
    Comment: First Person Vision, Egocentric Vision, Wearable Devices, Smart Glasses, Computer Vision, Video Analytics, Human-machine Interaction

    Left/Right Hand Segmentation in Egocentric Videos

    Full text link
    Wearable cameras allow people to record their daily activities from a user-centered (First Person Vision) perspective. Due to their favorable location, wearable cameras frequently capture the hands of the user, and may thus represent a promising user-machine interaction tool for different applications. Existing First Person Vision methods handle hand segmentation as a background-foreground problem, ignoring two important facts: i) hands are not a single "skin-like" moving element, but a pair of interacting cooperative entities; ii) close hand interactions may lead to hand-to-hand occlusions and, as a consequence, create a single hand-like segment. These facts complicate a proper understanding of hand movements and interactions. Our approach extends traditional background-foreground strategies by including a hand-identification step (left/right) based on a Maxwell distribution of angle and position. Hand-to-hand occlusions are addressed by exploiting temporal superpixels. The experimental results show that, in addition to a reliable left/right hand segmentation, our approach considerably improves traditional background-foreground hand segmentation.
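
    The identification step described above can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the angle of a segment's major axis and its horizontal position are scored under one Maxwell distribution per hand, with invented scale parameters and an assumed left/right geometry.

```python
import math

# Hypothetical sketch of the left/right identification step, not the
# authors' code: each hand segment is described by the angle of its major
# axis and its normalised horizontal position, and scored under one
# Maxwell distribution per hand. The scale parameter is illustrative.

def maxwell_pdf(x, a):
    """Maxwell-Boltzmann density with scale parameter a."""
    return math.sqrt(2.0 / math.pi) * x**2 / a**3 * math.exp(-x**2 / (2.0 * a**2))

def identify_hand(angle_deg, x_norm, scale=30.0):
    """Label a hand segment 'left' or 'right'.

    angle_deg: major-axis angle of the segment, in degrees
    x_norm:    centroid x-coordinate normalised to [0, 1]
    """
    # Assumed convention: a left hand enters from the left border with a
    # small tilt; the right hand mirrors this geometry.
    left_score = maxwell_pdf(angle_deg, scale) * (1.0 - x_norm)
    right_score = maxwell_pdf(180.0 - angle_deg, scale) * x_norm
    return "left" if left_score > right_score else "right"

print(identify_hand(40.0, 0.2))   # → left
print(identify_hand(140.0, 0.8))  # → right
```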

    Average consensus-based asynchronous tracking

    Get PDF
    Target tracking in a network of wireless cameras may fail if data are captured or exchanged asynchronously. Unlike traditional sensor networks, video processing may generate significant delays that also vary from camera to camera. Moreover, the continuous and rapid change in the dynamics of the consensus variable (the target state) makes tracking even more challenging under these conditions. To address this problem, we propose a consensus approach that enables each camera to predict the information of other cameras with respect to its own capture time-stamp, based on the information received. This prediction is key to compensating for asynchronous data exchanges. Simulations show the performance improvement of the proposed approach over the state of the art in the presence of asynchronous frame captures and random processing delays.
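
    The compensation idea can be sketched as follows, assuming a constant-velocity target model (the paper's exact filter is not reproduced): a camera predicts each neighbour's state from the neighbour's time-stamp to its own capture time before averaging.

```python
import numpy as np

# Minimal sketch under a constant-velocity assumption, not the authors'
# filter: a neighbour's state [x, y, vx, vy], stamped at its own capture
# time, is predicted forward to this camera's time-stamp before the
# average-consensus step, compensating for the asynchronous exchange.

def predict_to_timestamp(state, t_from, t_to):
    x, y, vx, vy = state
    dt = t_to - t_from
    return np.array([x + vx * dt, y + vy * dt, vx, vy])

def consensus_update(own_state, t_own, neighbours):
    """neighbours: list of (state, time-stamp) pairs from other cameras."""
    aligned = [predict_to_timestamp(s, t, t_own) for s, t in neighbours]
    return np.mean([own_state] + aligned, axis=0)  # average-consensus step

own = np.array([10.0, 5.0, 1.0, 0.0])            # own state at t_own = 1.0 s
stale = [(np.array([8.0, 5.0, 1.0, 0.0]), 0.8)]  # neighbour data, 0.2 s old
print(consensus_update(own, 1.0, stale))         # → [9.1 5.  1.  0. ]
```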

    Unsupervised Understanding of Location and Illumination Changes in Egocentric Videos

    Full text link
    Wearable cameras stand out as one of the most promising devices for the upcoming years, and as a consequence, the demand for computer algorithms to automatically understand the videos recorded with them is increasing quickly. Automatic understanding of these videos is not an easy task, and their mobile nature implies important challenges to be faced, such as changing light conditions and the unrestricted locations recorded. This paper proposes an unsupervised strategy based on global features and manifold learning to endow wearable cameras with contextual information regarding the light conditions and the location captured. Results show that non-linear manifold methods can capture contextual patterns from global features without requiring large computational resources. The proposed strategy is used, as an application case, as a switching mechanism to improve the hand-detection problem in egocentric videos.
    Comment: Submitted for publication

    Advantages of dynamic analysis in HOG-PCA feature space for video moving object classification

    Get PDF
    Classification of moving objects for video surveillance applications remains a challenging problem due to inherently changing video conditions such as lighting or resolution. This paper proposes a new approach to vehicle/pedestrian classification based on a static kNN classifier, a dynamic Hidden Markov Model (HMM)-based classifier, and a fusion rule that combines the two outputs. The main novelty consists in studying the dynamic aspects of the moving objects by analysing the trajectories followed by the features in the HOG-PCA feature space, instead of the classical trajectory study based on frame coordinates. The complete hybrid system was tested on the VIRAT database and ran in real time, yielding up to 100% peak accuracy on the tested video sequences.
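
    A hypothetical sketch of the fusion step (the paper's exact rule is not reproduced): the static kNN posterior for the current frame is combined with a dynamic posterior derived from class-specific HMM likelihoods over the HOG-PCA trajectory.

```python
import numpy as np

# Hypothetical fusion sketch, not the paper's rule: a weighted product of
# experts combines the static and dynamic class posteriors.

def fuse(p_static, p_dynamic, alpha=0.5):
    """Combine two class posteriors; alpha weights the static classifier."""
    p = (p_static ** alpha) * (p_dynamic ** (1.0 - alpha))
    return p / p.sum()

p_knn = np.array([0.6, 0.4])   # [vehicle, pedestrian] from the static kNN
p_hmm = np.array([0.2, 0.8])   # normalised HMM trajectory likelihoods
print(fuse(p_knn, p_hmm))      # dynamic evidence overturns the static vote
```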

    Use of Time-Frequency Analysis and Neural Networks for Mode Identification in a Wireless Software-Defined Radio Approach

    Get PDF
    The use of time-frequency distributions is proposed as a nonlinear signal processing technique that is combined with a pattern recognition approach to identify superimposed transmission modes in a reconfigurable wireless terminal based on software-defined radio techniques. In particular, a software-defined radio receiver is described, aiming at the identification of two coexistent communication modes: frequency-hopping code division multiple access and direct-sequence code division multiple access. As a case study, two standards based on these modes and operating in the same (industrial, scientific, and medical) band are considered: IEEE WLAN 802.11b (direct sequence) and Bluetooth (frequency hopping). Neural classifiers are used to obtain identification results, and two different neural classifiers are compared in terms of relative error frequency.
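
    A toy illustration of why a time-frequency view separates the two modes (not the paper's receiver): a frequency hopper's spectral peak jumps between hops, while a fixed narrowband carrier, standing in here for the DS mode's stable carrier, keeps a steady peak. A simple peak-track spread replaces the neural classifier in this sketch.

```python
import numpy as np

# Toy illustration, not the paper's receiver: a short-time spectrum makes
# a frequency-hopping signal's peak jump between hops, while a fixed
# carrier (a stand-in for the DS mode) keeps a steady peak. The hop
# frequencies and window size below are invented for the example.

fs = 1024
t = np.arange(fs) / fs

def peak_track_std(sig, win=64):
    """Std of the per-frame spectral-peak bin (a crude hopping indicator)."""
    frames = sig[: len(sig) // win * win].reshape(-1, win)
    spec = np.abs(np.fft.rfft(frames * np.hanning(win), axis=1))
    return spec.argmax(axis=1).std()

hop_freqs = np.repeat([96.0, 304.0, 208.0, 400.0], fs // 4)  # Hz, per sample
fh = np.sin(2 * np.pi * np.cumsum(hop_freqs) / fs)           # frequency hopper
tone = np.sin(2 * np.pi * 240.0 * t)                         # fixed carrier

print(peak_track_std(fh) > peak_track_std(tone))  # → True
```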

    Advanced Video-Based Surveillance

    Get PDF
    Over the past decade, we have witnessed a tremendous growth in the demand for personal security and defense of vital infrastructure throughout the world. At the same time, rapid advances in video-based surveillance have emerged and offered a strategic technology to address the demands imposed by security applications. These events have led to a massive research effort devoted to the development of effective and reliable surveillance systems endowed with intelligent video-processing capabilities. As a result, advanced video-based surveillance systems have been developed by research groups from academia and industry alike. In broad terms, advanced video-based surveillance could be described as intelligent video processing designed to assist security personnel by providing reliable real-time alerts and to support efficient video analysis for forensic investigations.

    A bio-inspired logical process for saliency detections in cognitive crowd monitoring

    Get PDF
    It is well known from physiological studies that the level of attention of adult individuals rapidly decreases after five to twenty minutes [1]. Attention retention for a surveillance operator is a crucial aspect of video surveillance applications and can have a significant impact on identifying relevant events, especially in crowded situations. In this field, advanced mechanisms for the selection and extraction of saliency information can improve the performance of autonomous video surveillance systems and increase the effectiveness of human-operator support. In particular, crowd monitoring is a central aspect of many practical applications for managing and preventing emergencies due to panic and overcrowding.